Theoretical and Applied Genetics
○ Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match Theoretical and Applied Genetics's content profile, based on 46 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Salomon, J.; Enjalbert, J.; Flutre, T.
The genetics of interspecific groups remains largely unexplored, despite the central role of social (or indirect) genetic effects in shaping phenotypic expression within communities. Intercropping, i.e. the simultaneous cultivation of multiple crop species in the same field, offers a powerful model to harness these interspecific social effects. Such species mixtures provide well-documented agricultural benefits, yet few breeding frameworks have integrated the genetics of social interactions. Here, we address this gap by extending quantitative genetic theory to interspecific groups, with intercropping as a concrete and applied model case. We propose a quantitative genetic model that jointly analyzes intra- and interspecific interactions within a unifying framework. Breeding values are decomposed into a direct component, shared between mono- and mixed crops, an interspecific social component corresponding to the effect of one species on another, and an intraspecific component that captures the social effects within a mono-genotypic stand of cloned plants. Statistically, this amounts to simultaneously fitting several linear mixed models, one per stand type, all having direct breeding values in common. As no open-source software can fit such a complex mixed model, we provide an implementation in R/C++. Simulations across various genetic (co)variance structures and sparse experimental designs showed accurate estimation of all genetic (co)variances and breeding values. With an incomplete, yet balanced design combining sole crops and intercrops, genetic gains in both systems were achievable simultaneously, enabling breeding strategies that progressively integrate intercropping into existing, sole-crop-only schemes. More broadly, this framework allows dissecting direct and social genetic effects when genotypes are observed in mono- and mixed-species situations, cultivated or not.
Hodehou, D. A. T.; Diatta, C.; Bodian, S.; Ndour, M.; Sambakhe, D.; Sine, B.; Felderhoff, T.; Diouf, D.; Morris, G. P.; Kane, N. A.; Faye, J. M.
Grain mold severely constrains sorghum [Sorghum bicolor (L.) Moench] productivity and grain quality in subhumid environments. Photoperiod-sensitive flowering plays a key role in mold avoidance and yield stability along north-south rainfall gradients. In response to the high susceptibility of elite cultivars in subhumid zones of Senegal, we developed and characterized a recombinant inbred line (RIL) population derived from Nganda (grain mold-susceptible) and Grinkan (photoperiod-sensitive) varieties. The population was evaluated across three distinct agro-ecological zones over two years. Environmental indices derived from genotype-environmental interactions, together with defined growth windows, strongly influenced flag leaf appearance (FLA), a photoperiodic flowering trait. Plasticity parameters (intercept and slope) for environmental indices, FLA, grain mold severity, and yield enabled identification of loci contributing to flowering response, mold resistance, and yield stability. The maturity gene Ma1 and two QTLs for FLA, qFLA6.2 and qFLA6.3, were identified, stable across environments, and colocalized with grain mold and yield QTLs. The wild-type Ma1 allele from Grinkan delayed FLA and reduced grain mold damage but was not associated with increased yield. The Ma1 effect was confirmed using the developed breeder-friendly KASP marker, Sbv3.1_06_40312464K, in 174 F3 three-way cross families. Photoperiod-sensitive lines with intermediate-to-late FLA alleles showed strong negative associations with mold damage. Overall, the identified stable loci and candidate lines provide foundations for effective molecular breeding of climate-resilient varieties.
Plain language summary: Grain mold is a fungal disease that reduces sorghum grain yield and quality, particularly in subhumid climates. With the limited number of resistant elite varieties, photoperiod-sensitive flowering in response to day length variation can contribute to grain mold escape at the end of rainy seasons. We characterized 286 sorghum recombinant inbred lines across three contrasting environments over two years along rainfall gradients in Senegal. Using flag leaf appearance (FLA), which is a photoperiodic flowering trait, strong genotype-environment interactions for FLA and genotypic plasticity were revealed. We identified and validated the common genomic locus associated with FLA variation and its plasticity across environments, the canonical maturity gene Ma1, which was influenced by temperature variation across environments. The presence of Ma1 in the background of photoperiod-sensitive lines enhances grain mold avoidance and yield stability along rainfall gradients in Senegal.
Core ideas:
- We investigated photoperiodic flowering plasticity in sorghum as a contributor to grain mold resistance and yield stability along rainfall gradients.
- The Maturity locus Ma1 (qFLA6.1) is the major contributor of photoperiodic flowering and its plasticity across semi-arid and subhumid environments.
- Hybrid genotypes carrying two stable loci qFLA6.1 and qFLA6.2 sustain high grain mold avoidance in diverse environments.
- Photoperiod-sensitive lines with medium to late flowering times are effective in avoiding grain mold, while maintaining yield stability in subhumid regions.
Blois, L.; Heuclin, B.; Bernard, A.; Denis, M.; Dirlewanger, E.; Foulongne-Oriol, M.; Marullo, P.; Peltier, E.; Quero-Garcia, J.; Marguerit, E.; Gion, J.-M.
Deciphering the genetic architecture of complex quantitative phenotypes remains challenging in quantitative genetics. These traits not only depend on multiple genetic factors but are also established over time and across environments. Although quantitative genetics has investigated the genetic determinism of phenotypic plasticity in contrasting environmental conditions, time-related phenotypic plasticity has received less attention. Here we propose a multivariate Bayesian framework, the Bayesian Varying Coefficient Model (BVCM), designed for analysing the genetic architecture of time-related phenotypic plasticity with a multilocus approach. We applied the BVCM to time series phenotypes measured at various time scales (daily, monthly, yearly) across a diverse set of biological species. We included in this study: yeast (Saccharomyces cerevisiae), fungi (Fusarium graminearum), eucalyptus (Eucalyptus urophylla x E. grandis), and sweet cherry tree (Prunus avium). The BVCM results were compared with those obtained with a known genome-wide association method carried out time point by time point. For all species and traits, the BVCM was able to detect the major QTL identified by marker-trait association methods and revealed additional genetic regions of weak effect. It also increased the phenotypic variance explained for most of the phenotypes considered. It revealed dynamic QTLs with transitory, increasing or decreasing effects over time. By considering both the temporal and genetic multivariate structures in a single statistical model, we increased our understanding of the genetic architecture of complex traits, notably by reducing the issue of missing heritability. More broadly, this work lays the foundation for extended applications in functional genomics, evolutionary ecology, and crop breeding programs, in which time-related phenotypic plasticity remains crucial for predicting and selecting key quantitative complex traits.
Key message: By capturing the genetic factors influencing time-related phenotypic plasticity, our approach contributes to a deeper understanding of the dynamic nature of genotype-phenotype relationships.
Kumar, N.; Singh, B. P.; Mishra, P.; Rani, M.; Gurjar, A.; Mishra, A.; Shah, A.; Gadol, N.; Tiwari, S.; Rathor, S.; Sharma, P. C.; Krishnamurthy, S. L.; Takabe, T.; Mitsuya, S.; Kalia, S.; Singh, N. K.; Rai, V.
Salinity and sodicity stresses adversely affect rice growth and yield. To overcome yield losses, suitable tolerant rice cultivars can be developed through a marker-assisted breeding (MAB) program. In the present study, genomic regions associated with sodicity stress tolerance at the reproductive stage were identified using a high-density 50K SNP array in a recombinant inbred line (RIL) population derived from the contrasting rice genotypes CSR11 and MI48. A total of 50 QTLs were detected for various yield-related traits; further, 19 QTLs with ≥15% of phenotypic variance were selected for integrated (omics) analysis. RNA sequencing of leaves and panicles at the reproductive stage under sodic stress conditions was employed to find differentially expressed genes. A total of 1368 and 1410 SNPs; 104 and 144 indels were found for MI48 and CSR11, respectively, within the QTL regions from resequencing. At chromosomes 1 and 6, colocalized QTLs (qPH1-1/qGP1-1 and qGP6-2/qSSI6-2) were discovered. Differentially expressed genes (DEGs) were mapped over the QTL regions selected, and SNP variations and indels were screened for colocalized QTLs. Potential candidate genes, namely Os-pGlcT1 (Os01g0133400), OsHKT2;1 (Os06g0701600) and OsHKT2;4 (Os06g0701700), OsANTH12 (Os06g0699800), and OsPTR2 (Os06g0706400), were identified as being responsible for glucose transport, ion homeostasis, pollen germination, and nitrogen use efficiency, respectively, under salt stress. Finally, our study provides important insights into the genes and potential mechanisms affecting grain yield under sodic stress in rice, which will contribute to the development of molecular markers for rice breeding programs.
Lev-Mirom, Y.; Avni, R.; Nave, M.; Kulikovsky, S.; Oren, L.; Eilam, T.; Sela, H.; Distelfeld, A.
The transition from hulled to free-threshing grain was a pivotal event in wheat domestication, enabling efficient harvesting and processing. Threshability in tetraploid wheat is controlled primarily by the Q locus and two Tenacious glume (Tg) loci on chromosomes 2A and 2B, yet the molecular basis of the major Tg1-B locus remains incompletely characterized. Here, we phenotyped a durum wheat x wild emmer wheat (WEW) recombinant inbred line (RIL) population across two field environments and performed QTL analysis for glume tenacity (TG), threshability ratio (THRR), and seed number per spike (SDNPS). A total of 19 significant QTLs were detected across six chromosomes. The largest-effect loci for both TG and THRR co-localized on chromosome 2B, with LOD scores up to 14.22 and phenotypic variance explained up to 31.2%, corresponding to the previously described Tg1-B locus. To validate this QTL, the donor RIL was backcrossed three times to Svevo to generate a near-isogenic line, NIL-65 (BC3F5), confirmed by whole-genome skim sequencing to carry a homozygous WEW introgression at Tg1-B. A segregating BC4F2 population derived from NIL-65 confirmed that plants homozygous for the dominant Tg1-B allele displayed significantly higher glume tenacity and intact glume morphology compared to tg1-B sister lines, which exhibited basal glume cracking characteristic of the free-threshing phenotype. Genotyping-by-sequencing delimited the causal interval to an approximately 11 Mb introgression on chromosome 2B. These results confirm the major role of Tg1-B in determining glume tenacity in tetraploid wheat, provide a validated near-isogenic germplasm resource, and lay the foundation for fine-mapping and functional characterization of the underlying gene(s).
Monyak, T.; Morris, G.
Global networks of crop breeding programs leverage diverse germplasm, but diversity increases the complexity of maintaining stability in their elite genepools. To characterize genetic heterogeneity in breeding metapopulations and develop insights on how to manage it, we simulated the evolution of breeding populations on fitness landscapes. We revealed the geometric decrease in the average effect size of alleles segregating as standing variation that become fixed along an adaptive walk. We also demonstrated how independent adaptive walks of subpopulations are influenced by genetic drift, leading to cryptic genetic heterogeneity among elite genepools. This variation is released when elite lines derived from independent subpopulations are crossed, leading to segregation for 2-4X more major QTL in admixed families than in unadmixed families, and 2-4X more epistatic interactions. The emergent property of fitness epistasis for traits under stabilizing selection is well-understood in evolutionary genetics, but under-appreciated in crop quantitative genetics. To highlight the importance of this phenomenon, we constructed an empirical genotype-to-fitness landscape from the sorghum NAM, a global admixed prebreeding resource, demonstrating the utility of fitness landscapes for inferring genetic compatibilities within metapopulations. Our findings suggest that in breeding networks, strategies for effective germplasm exchange must account for epistasis in the oligogenic component of the genetic architecture of locally-adapted traits.
Article summary: Modern public sector crop improvement happens in networks of breeding programs that routinely exchange genetic information. Traditional models for understanding quantitative traits have limited predictiveness in situations with such genetic heterogeneity. This study uses breeding simulations and empirical data to show the utility of the fitness landscape framework for characterizing the genetic architecture of complex traits in breeding metapopulations. By simulating the evolution of breeding programs and integration into networks, it demonstrates how epistatic interactions between large-effect alleles are a fundamental property that must be accounted for when exchanging germplasm.
[Graphical abstract: Figure 1]
Robles-Zazueta, C. A.; Strack, T.; Schmidt, M.; Callipo, P.; Robinson, H.; Vasudevan, A.; Voss-Fels, K.
Grapevine cluster architecture is a key selection target in breeding programs because it influences disease susceptibility, yield stability and juice quality. High-throughput phenotyping offers a rapid and non-destructive approach to capture biochemical and structural variation in these traits, yet the influence of plant organ reflectance and data partitioning strategies on trait prediction remains poorly understood. In this study, we evaluated how hyperspectral reflectance from different grapevine organs contributes to the prediction of cluster architecture and juice quality traits in two clonal populations of Riesling and Pinot. Using partial least squares regression (PLSR), we assessed the prediction accuracy of eight cluster architecture and six juice quality traits under two data partitioning strategies. Models based on cluster reflectance outperformed those using dry leaf reflectance for most traits, except for pH. Partitioning the dataset by cluster type increased trait variance and improved predictions for number of berries (R² = 0.53), berry diameter (R² = 0.79), and total acidity (R² = 0.48). Visible, red-edge and NIR spectra were the most informative regions for predicting the traits studied. Together, our results highlight the importance of organ-specific data and appropriate calibration strategies to improve phenomic models for the development of scalable proxies for grapevine improvement.
Highlight: Spectral phenomics reveals that prediction accuracy in grapevine depends on organ spectral signatures and traits, with cluster reflectance outperforming leaves, informing new phenotyping strategies for breeding improvement.
Murakami, K.; Narihiro, T.; Horikoshi, M.; Matsuhira, H.; Kuroda, Y.
Improving photosynthesis is a promising approach to enhance sugar beet productivity. However, genetic variation in leaf photosynthesis and its relationship with disease resistance remain underexplored. We evaluated 98 sugar beet genotypes representing different breeding categories, including commercial F1 hybrids, seed-parent lines, and pollinator lines, in Hokkaido, northern Japan. Leaf gas exchange was measured during early growth under field conditions around the infection period of Cercospora leaf spot (CLS). To account for fluctuating irradiance during large-scale phenotyping, we applied a multilevel mixed-effects light-response model to estimate genotype-specific photosynthetic characteristics. Substantial genotypic variation in photosynthetic characteristics was detected. F1 hybrids exhibited higher photosynthetic capacity than breeding lines, whereas differences among breeding categories were unclear due to large within-category variation. Some breeding lines exhibited photosynthetic rates higher than those of hybrids, indicating exploitable genetic resources within the present genetic panel. We did not detect a statistically significant trade-off between leaf photosynthesis and CLS resistance among the 98 genotypes; in a subset of 19 genotypes analysed in detail, the relationship was even synergistic. Our results highlight the genetic diversity of leaf photosynthesis and its category-dependent structure, and suggest that selection for enhanced photosynthesis can proceed without substantial trade-off with CLS resistance.
Highlight: Leaf photosynthesis of 98 sugar beet genotypes showed significant genetic variation and dependence on breeding category. Active photosynthesis incurred minimal trade-off with Cercospora leaf spot resistance.
Kimura, K.; Yamaguchi, T.; Matsui, T.
Heat-tolerant rice cultivars are essential for mitigating global warming impacts. Basal anther dehiscence length (BDL) is a promising visible morphological marker for heat tolerance through stable pollination. We investigated the effects of sowing date on anther morphology, pollination, and fertility under controlled high-temperature conditions (35, 37, or 39 °C at flowering). Three japonica cultivars, Akitakomachi (early heading), Koshihikari (medium), and Hatsushimo (late), were sown monthly over 3 months and grown in pots. At heading, the plants were exposed to the temperature treatments for 3 days, and the proportion of florets with ≥10 germinated pollen grains on the stigma (GP10) and seed set were assessed. Among anther traits, BDL showed the greatest variation, with all cultivars from the second sowing exhibiting the shortest BDL. Analysis of variance revealed significant effects of genotype, sowing date, and their interaction on anther traits and fertility. Regression analysis indicated that fertility was associated with GP10, with BDL contributing significantly to GP10 in the late-heading Hatsushimo, together with maximum temperature at flowering. Thus, both genotype and environment shape anther morphology, pollination, and fertility, indicating that BDL plasticity and genotype-specific environmental responses must be carefully considered when using BDL as a breeding marker for heat tolerance.
Highlight: Variation in sowing date significantly affects anther morphology and heat tolerance in rice. Genotype-specific responses to the growing environment require careful consideration for reliable breeding assessments.
Proma, S.; Garcia-Abadillo, J.; Sagae, V. S.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Shaik, A.; Jarquin, D.
Genomic selection holds the potential to serve as a strategic tool to enhance the genetic gain of complex traits in Miscanthus breeding programs. The development of improved cultivars requires their assessment for various traits across diverse environments to ensure suitable overall performance. Hence, multi-trait multi-environment (MTME) genomic prediction (GP) models offer an opportunity to improve selection accuracy. This study aims to evaluate the potential of five GP models: (1) three MTME models including genotype-by-trait-by-environment interaction (GxExT) and (2) two single-trait multi-environment (STME) models (with and without GxE interaction). A Miscanthus sacchariflorus population comprising 336 genotypes evaluated in three environments and scored for four traits (biomass yield YDY, total culm number TCM, average internode length AIL, and culm node number CNN) was analyzed. The predictive ability of the models was evaluated considering three cross-validation schemes resembling realistic scenarios (CV1: predicting new genotypes, CVP: predicting missing traits in a given environment, and CV2: predicting partially observed genotypes). On average, across all cross-validation schemes, the predictive ability of the MTME models was 10% to 70% higher than that of the STME models for TCM and AIL. On the other hand, for YDY and CNN, both STME models performed similarly to or slightly better (by 5 to 64%) than the MTME models in most environments. While the MTME models were not successful for all traits when compared to their STME counterparts, they improved the prediction of the performance of genotypes that were untested across environments or lacked trait information in a specific environment. Overall, our study suggests that MTME GP models can be implemented in Miscanthus breeding programs to improve the predictive ability for complex traits, shorten breeding cycles, and accelerate selection decisions.
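The difference between the CV1 and CV2 schemes named in this abstract can be sketched in a few lines. The sketch follows only the one-line definitions given above (CV1 holds out whole genotypes, CV2 holds out individual genotype-environment cells); the function names, the fold count, and the toy sizes are assumptions for illustration and are not taken from the preprint. CVP is omitted because it needs a trait dimension.

```python
import random

def cv1_folds(genotypes, environments, k=5, seed=0):
    """CV1: each fold holds out every record of a subset of genotypes,
    so held-out genotypes are unobserved in all environments."""
    rng = random.Random(seed)
    shuffled = list(genotypes)
    rng.shuffle(shuffled)
    folds = []
    for i in range(k):
        held_out = set(shuffled[i::k])
        folds.append([(g, e) for g in genotypes for e in environments
                      if g in held_out])
    return folds

def cv2_folds(genotypes, environments, k=5, seed=0):
    """CV2: each fold holds out individual genotype-environment cells,
    so a held-out genotype is usually still observed elsewhere."""
    rng = random.Random(seed)
    cells = [(g, e) for g in genotypes for e in environments]
    rng.shuffle(cells)
    return [cells[i::k] for i in range(k)]

# Toy sizes (hypothetical, much smaller than the 336 genotypes in the study).
genotypes = [f"G{i}" for i in range(20)]
environments = ["E1", "E2", "E3"]
cv1 = cv1_folds(genotypes, environments)
cv2 = cv2_folds(genotypes, environments)
```

In both schemes every genotype-environment cell is tested exactly once; what changes is how much information about a held-out genotype remains in the training set.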
Sato, Y.; Hamazaki, K.
Individual phenotypes often depend on the genotypes of other individuals within a group. These phenomena are termed indirect genetic effects (IGEs) and have been distinguished from direct genetic effects (DGEs) using quantitative genetic models. Recent studies have utilized high-resolution polymorphism data to enable genomic prediction (GP) and genome-wide association study (GWAS) of IGEs, but unified methods remain limited. Here we integrate polygenic and oligogenic IGEs using a multi-kernel mixed model incorporating two random effects with a single covariance parameter. Underlying this implementation, the Ising model of ferromagnetism enabled us to simplify locus-wise and background IGEs for GWAS and GP, respectively. Our simulations demonstrated that, while the previous and present models exhibited similar performance, the present model can infer a trade-off between DGEs and IGEs. By applying this method to three species of woody plants, we found evidence for intergenotypic competition in aspen and apple trees, but limited evidence in climbing grapevines. Based on GWAS, we also detected significant variants associated with the competitive IGEs on the apple trunk growth. Our study offers a flexible implementation for GWAS/GP of IGEs, thereby providing an effective tool to dissect the genetic architecture of group performance.
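The distinction between direct and indirect genetic effects can be made concrete with a toy calculation. This is a minimal sketch of the classical additive formulation (focal phenotype = own DGE plus the summed IGEs of group mates), not the multi-kernel, Ising-based model of the preprint; the function name and all effect values are hypothetical.

```python
import random

def simulate_phenotypes(groups, dge, ige, noise_sd=0.0, seed=0):
    """Phenotype of a focal individual = its own direct genetic effect (DGE)
    plus the summed indirect genetic effects (IGEs) of its group mates,
    plus optional Gaussian noise."""
    rng = random.Random(seed)
    phenotypes = {}
    for group in groups:
        for focal in group:
            social = sum(ige[mate] for mate in group if mate != focal)
            phenotypes[focal] = dge[focal] + social + rng.gauss(0.0, noise_sd)
    return phenotypes

# Hypothetical values: "A" performs well itself (high DGE) but suppresses
# its neighbours (negative IGE), i.e. a DGE-IGE trade-off.
dge = {"A": 2.0, "B": 1.0, "C": 1.0}
ige = {"A": -0.5, "B": 0.25, "C": 0.0}
pheno = simulate_phenotypes([["A", "B", "C"]], dge, ige)
group_total = sum(pheno.values())
```

With these toy values the group total is lower than the sum of the direct effects alone, which is the kind of competitive signal that separating DGEs from IGEs is meant to expose.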
Lapous, R.; Haquet, C.; Denance, C.; Benejam, J.; Perchepied, L.; Hellyn, K.; Muranty, H.; Durel, C.-E.; Ferreira de Carvalho, J.
Apple scab, caused by Venturia inaequalis, remains one of the most damaging diseases in apple orchards, driving intensive pesticide use worldwide. Reducing this dependence requires the deployment of durable resistance, ideally through the combination of major resistance genes (R genes) with quantitative trait loci (QTL) that confer partial and potentially complementary protection. Yet, few apple scab QTLs have been functionally validated, and their underlying mechanisms remain largely unresolved. Here, we refined and functionally characterized, using transcriptomic data, five resistance QTLs in a biparental population of 1,970 individuals derived from the cross TN 10-8 x Fiesta. Using 43 newly developed KASP markers, QTL locations were substantially refined through high-resolution genotyping and phenotyping with two V. inaequalis isolates exhibiting contrasting virulence. Four QTLs (qT1, qF11, qF17, qT13) were validated, while qF3 was not confirmed. Comparison of transcriptomic data revealed the expression of candidate genes within the narrowed intervals, including receptor-like proteins in qT1, and RNAi- and signaling-related genes in qF11 and qF17, suggesting a diversified and complementary defense network. These findings refine the genetic architecture of apple scab resistance and suggest the existence of shared molecular pathways between major R genes, such as the well-described Rvi6 gene, and quantitative resistance, for instance the QTL qT1. The identified loci and markers provide robust tools for marker-assisted and genomic breeding aimed at developing apple cultivars with complementary and potentially durable resistance pathways.
Chaplin, E. D.; Tanaka, E.; Merchant, A.; Sznajder, B.; Trethowan, R.; Salter, W. T.
Stomatal traits balance carbon gain with water loss, yet their breeding potential in wheat remains underexploited. This study investigated physiological and anatomical stomatal responses alongside yield across two years of large-scale field trials under water-limitation and delayed sowing-induced heat exposure. Across both seasons, stomatal conductance (gs) declined under stress, reflecting strong environmental constraint on gas-exchange (water-limitation: -26.9%; heat: -13.8%). Partitioning responses by leaf surface and genotype identified the adaxial surface as the dominant contributor to gs variation and the most stress responsive. Despite increases in theoretical anatomical gas-exchange capacity (gsmax), gs-efficiency declined, indicating partial decoupling between structural potential and realised conductance. Drought reduced stomatal size while increasing density whereas heat increased size, suggesting stress-specific anatomical plasticity. Moderate-to-high heritability was observed for anatomical traits (water-limitation: 0.13-0.57; heat: 0.42-0.71), contrasting with lower and less stable heritability for gs (water-limitation: 0.13-0.41; heat: 0.13-0.50). Genome-wide-association-mapping identified 169 putative QTLs, predominantly for anatomical traits, including stable and co-localised pleiotropic loci. Fourteen sets of closely positioned markers were detected across seasons or studies, with stable regions on chromosomes 2B, 3B and 7B emerging as key loci. Focusing on stable loci controlling adaxial stomatal anatomy offers a realistic strategy to enhance wheat photosynthetic efficiency and climate resilience.
Highlight: Adaxial stomatal traits dominate gas exchange responses to heat and drought in wheat, with stable anatomical QTL identified on chromosomes 2B, 3B and 7B. Their stability across environments supports their relevance for crop improvement in water-limited and high temperature systems.
Lourenco, V. M.; Ogutu, J. O.; Piepho, H.-P.
Data contamination--from recording errors to extreme outliers--can compromise statistical models by biasing predictions, inflating prediction errors, and, in severe cases, destabilizing performance in high-dimensional settings. Although contamination can affect responses and covariates, we focus on response contamination and evaluate Random Forests through simulation. Using a synthetic animal-breeding dataset, we assess robust Random Forests across several contamination scenarios and validate them on plant and animal datasets. We thereby clarify the consequences of contamination for prediction, develop a robust Random Forest framework, and evaluate its performance. We examine preprocessing or data-transformation strategies, algorithmic modifications, and hybrid approaches for robustifying Random Forests. Across these approaches, data transformation emerges as the most effective strategy, delivering the strongest performance under contamination. This strategy is simple, general, and transferable to other Machine Learning methods, offering a remedy for robust genomic prediction. In real breeding data, robust Random Forests are useful when substantial contamination, phenotypic corruption, misrecording, or train-deployment mismatch is plausible and the goal is to recover a latent signal for genomic prediction and selection; ranking-based robust Random Forests are the dependable first option, whereas weighting-based Random Forests should be used only when their weighting scheme preserves rank structure and improves prediction. Robustification is not universally necessary, but it becomes important when contamination distorts the link between observed responses and the predictive target; standard Random Forests remain the default for clean data, whereas robust Random Forests should be fitted alongside them whenever contamination is plausible, with the final choice guided by data, trait, and breeding objective. 
Author summary: Machine learning (ML) methods are widely used for prediction with high-dimensional, complex data, and supervised approaches such as Random Forests (RF) have proved effective for genomic prediction (GP) and selection. Yet their performance can be severely compromised by data contamination if the algorithms rely on classical data-driven procedures that are sensitive to atypical observations. Robustifying ML methods is therefore important both for improving predictive performance under contamination and for guiding their practical use in high-dimensional prediction problems. To address this need, we develop robust preprocessing, algorithm-level, and hybrid strategies for improving RF performance with contaminated data. Using simulated animal data, we show that ranking- and weighting-based robust RF provide the strongest overall compromise for genomic prediction and selection under contamination. Validation on several plant and animal breeding datasets further shows that the benefits of robustification are not universal, but depend on the dataset, trait, and breeding objective. Although motivated by RF, the framework we propose is general, practical, and readily transferable to other ML methods. It also offers a basis for deciding when robustness should complement standard RF rather than replace it outright.
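Why a ranking-based transformation of the response blunts contamination can be shown with a small sketch. The abstract does not state which transformation the authors use, so the rank-based normal-scores transform below is an assumption chosen for illustration, and the function name and data are hypothetical. Only the Python standard library is used.

```python
from statistics import NormalDist

def rank_normal_scores(y):
    """Replace each response by the standard normal quantile of its rank.
    Ties are broken by input order (a simplification for this sketch)."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    standard_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the probability strictly inside (0, 1).
        scores[i] = standard_normal.inv_cdf((rank - 0.5) / n)
    return scores

clean = [1.2, 0.7, 1.9, 1.4, 1.0]
contaminated = [1.2, 0.7, 190.0, 1.4, 1.0]  # third value misrecorded

clean_scores = rank_normal_scores(clean)
contaminated_scores = rank_normal_scores(contaminated)
```

Because the misrecorded value keeps its rank, both vectors map to the same scores, so a learner trained on the transformed response is unaffected by the size of the outlier. The price is that predictions live on the rank scale, which suits selection (ranking candidates) better than predicting raw trait values.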
Proma, S.; Lubanga, N.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Shaik, A.; Garcia-Abadillo, J.; Jarquin, D.
Phenotyping high-biomass perennial crops is laborious and the rate of genetic gain in perennial crop breeding programs is typically low. So, it is especially important to identify methods that produce efficiency gains in the breeding process. Miscanthus is a C4 perennial grass with favorable characteristics for producing biomass as a feedstock for biofuels and diverse biobased products. Increasing biomass yield will increase profitability and environmental benefits, so is a key target for Miscanthus breeding. In addition, the identification of well-adapted genotypes across a wide range of environmental conditions requires the establishment of multi-environment trials (METs). Sparse testing is a genomic prediction-based strategy that reduces the phenotyping costs in METs by selecting a subset of genotypes to evaluate in a subset of environments and then predicts the performance of the unobserved genotype-environment combinations. A Miscanthus sacchariflorus (MSA) population comprising 336 genotypes observed across three environments was analyzed. Three prediction models considering main effects (environments, genotypes, genomic) and interaction effects (genotype-by-environment; GxE interaction) were implemented for forecasting dry biomass yield (YDY), total culm (TCM), average internode length (AIL), and culm node number (CNN). Multiple calibration sets based on different compositions and sizes were considered to evaluate performance in terms of the predictive ability (PA) and the mean square error (MSE) for a fixed testing set size. The training set size ranged from 52 to 112 to predict a fixed set of 224 unobserved genotypes across all three environments. The results showed that the model accounting for GxE interaction presented the highest PA and the lowest MSE for CNN (PA: ~0.77, MSE: ~0.5) and YDY (PA: ~0.70, MSE: ~1.3) while for TCM and AIL these ranged from ~0.28 to 0.41 and ~1.3 to 4.3, respectively.
Overall, varying training sets and allocation strategies did not affect PA and MSE, with 52 non-overlapping and 0 overlapping genotypes per environment as the optimal cost-effective allocation framework. This suggests that implementing sparse testing designs could reduce phenotyping costs fivefold without compromising PA in breeding programs for perennial crops such as Miscanthus.
Kottelenberg, D. B.; Morales, A.; Anten, N. P. R.; Bastiaans, L.; Evers, J. B.
In cereal-legume intercrops, weed suppression is primarily driven by cereals, whose competitiveness is shaped by trait plasticity, i.e. morphological adjustments in response to the intercrop environment. However, how individual cereal traits respond plastically and contribute to system performance remains unclear, hampering improvements through breeding or system design. We combined field experiments with functional-structural plant modelling to quantify plastic responses of four cereal traits (tiller number, tiller angle, specific leaf area (SLA), and specific internode length (SIL)) and their effects on weed suppression and crop productivity. Field measurements revealed plasticity in tiller number, tiller angle, and SIL between sole crops and intercrops, while SLA showed minimal differences. Simulations showed that intermediate tiller numbers resulted in the strongest weed suppression and highest productivity, indicating an optimum, while more horizontal tillers suppressed weeds slightly better than vertical ones. Weed suppression increased with higher SLA values, while SIL showed a saturating response, increasing to intermediate SIL values and plateauing thereafter. In simulations with short-statured cereal phenotypes (low SIL), the reduction in cereal weed suppression was compensated by the legume component. This study demonstrates how FSP modelling can be used to investigate trait plasticity mechanisms and generate testable hypotheses about trait effects in complex intercrop systems. Highlight: Cereal trait plasticity shapes weed suppression in cereal-legume intercrops, with distinct response patterns per trait, while legumes can compensate for weakly competitive cereals, suggesting balanced competition over cereal dominance.
Gregoire, M.; Pateyron, S.; Brunaud, V.; Tamby, J. P.; Benghelima, L.; Martin, M.-L.; Girin, T.
Nitrogen fertilizers are essential for crop productivity but cause environmental harm, necessitating the development of cultivars that thrive under limited nitrogen. This study investigates the transcriptomic response to nitrate in Arabidopsis thaliana (a model dicot), Brachypodium distachyon (a model Pooideae), and Hordeum vulgare (barley, a domesticated Pooideae) to identify conserved and species-specific molecular mechanisms. Using RNA-seq after 1.5 and 3 hours of nitrate treatment, we found that core nitrate-responsive biological processes - such as nitrate transport, assimilation, carbon metabolism, and hormone signaling - are largely conserved across species. However, comparative analysis at gene level based on orthology revealed specificities between the species. For instance, rRNA processing was uniquely stimulated in Arabidopsis, while cysteine biosynthesis from serine and gibberellin biosynthesis were specifically regulated in Brachypodium and barley. Orthologs of key nitrate-responsive genes (e.g., NRT, NLP, TCP20) exhibited variable regulation, reflecting potential adaptations linked to domestication or nutrient acquisition strategies. These findings highlight the importance of integrating model and crop species to uncover targets for improving nitrogen use efficiency in cereals. The study provides a pipeline integrating gene ontology and orthology analyses to compare transcriptomic responses between species.
Tressel, L. G.; Caspersen, A. M.; Walling, J. G.; Gao, D.
Barley (Hordeum vulgare L.) is an important crop worldwide, and its seed dormancy is primarily controlled by a Mitogen-Activated Protein Kinase Kinase 3 (MKK3) gene. Although the kinase activity of MKK3 and its roles in barley post-domestication have been widely studied, the pre-domestication evolution of MKK3 and the spread of nondormant alleles among global barley varieties remain largely unexplored. In this study, we analyzed MKK3 sequences in barley and its wild progenitor (H. spontaneum) and identified two polymorphic miniature inverted-repeat transposable elements (MITEs). Comparative analyses indicated that the insertion/excision of the MITEs predated the current estimates of barley domestication. Examination of the barley pangenomes coupled with droplet digital (dd) PCR revealed extensive copy number variation of MKK3 and suggested that transposons likely drove tandem amplification of the MKK3 gene on chromosome 5H. Additionally, approximately 1-Kb MKK3 sequences were found on chromosomes 1H and 6H. Further analysis indicated that these short MKK3 sequences were captured by a CACTA transposon that also contained fragments from four other expressed genes. The acquisition of MKK3 was estimated to have occurred 1.9-2.5 million years ago. Together, these findings illuminate the dynamic pre-domestication evolution of the MKK3 gene and suggest three independent origins of highly nondormant barley worldwide, including a unique lineage predominant in Ethiopian germplasm. This study reveals the pivotal roles of transposons in MKK3 evolution and provides helpful information for understanding the complex history of the MKK3 gene in barley and for improving preharvest sprouting (PHS)-tolerant varieties under distinct natural conditions.
Djemal, R.; Trabelsi, R.; Ghazala, I.; Ebel, C.; Messerer, M.; Boukouba, R.; Gdoura-Ben Amor, M.; Charfeddine, S.; Elleuch, A.; Gdoura, R.; Mayer, K. F. X.; Winkler, J. W. B.; Schnitzler, J.-P.; Hanin, M.
Drought is a major constraint on the productivity of durum wheat across Mediterranean and North African regions. To elucidate the mechanisms underlying drought resilience, we employed a combination of scenario-controlled phenomics and flag leaf transcriptomics across ten durum wheat genotypes. These included the Tunisian landraces Chili and Mahmoudi, seven breeding lines, and the reference cultivar Svevo. The plants were grown to maturity under well-watered or long-term drought conditions in pots and rhizotrons, enabling a comprehensive assessment of growth, yield components, root architecture, physiological traits, and reaction norm plasticity. Drought markedly reduced performance, yet Chili and Mahmoudi consistently maintained superior biomass, grain number and intrinsic water use efficiency (iWUE). This was supported by balanced C/N allocation, strong osmotic adjustment, and the ability to sustain robust root systems under stress, albeit through partly divergent physiological strategies. Transcriptomic profiling revealed highly genotype-specific responses, with drought tolerance unrelated to the number of differentially expressed genes. Instead, the landraces displayed distinct regulatory programs involving mainly photosynthesis protection, ABA-related transporters, osmotic adjustment pathways, and stress-responsive transcription factors. These mechanistic insights identify actionable physiological and molecular determinants of drought plasticity and provide high-value targets for accelerating the breeding of climate-resilient durum wheat. Highlights: Integrated phenomics and transcriptomics revealed landrace-specific physiological and molecular mechanisms enabling superior drought resilience and identifying actionable targets for durum wheat improvement.
Halpin-McCormick, A.; Nalla, M. K.; Radlicz, Z.; Zhang, A.; Fumia, N.; Lin, T.-h.; Lin, S.-w.; Wang, Y.-w.; Zohoungbogbo, H. P. F.; Wang, D. R.; Runck, B.; Gore, M. A.; Kantar, M. B.; Barchenger, D. W.
Climate change increasingly threatens global Capsicum (pepper) production. Accelerating the deployment of climate-resilient cultivars requires effective use of genetic diversity conserved in genebanks. We implement a "turbocharging" strategy in Capsicum by integrating genome-wide association studies and genomic prediction in a core collection (n = 423), followed by genomic prediction across the global collection (n = 10,250) using the core as a training population. We generated genomic estimated breeding values (GEBVs) for 31 high-accuracy traits (r > 0.5) encompassing hyperspectral phenotypes (heat/control), agronomic performance (heat/control) and fruit quality. To enhance accessibility and decision-making, we developed a large language model (LLM) integrated application that enables flexible, preference-based selection of candidates. By narrowing the parental decision space, this framework streamlines screening of large germplasm collections while balancing climate resilience, quality attributes and market demands. Our approach provides a scalable decision-support system to accelerate climate-resilient Capsicum breeding and maximize global genetic resources.